Toric Ideals of Homogeneous Phylogenetic Models 



Nicholas Eriksson 

Department of Mathematics 
University of California, Berkeley 
Berkeley, CA 94720-3840 

eriksson@math.berkeley.edu 



ABSTRACT 

We consider the model of phylogenetic trees in which every 
node of the tree is an observed, binary random variable and 
the transition probabilities are given by the same matrix on 
each edge of the tree. The ideal of invariants of this model 
is a toric ideal in $\mathbb{C}[p_{i_1 i_2 \ldots i_n}]$. We are able to compute the
Gröbner basis and minimal generating set for this ideal for
trees with up to 11 nodes. These are the first non-trivial
Gröbner bases calculations in $2^{11} = 2048$ indeterminates.
We conjecture that there is a quadratic Gröbner basis for
binary trees, but that generators of degree n are required 
for some trees with n nodes. The polytopes associated with 
these toric ideals display interesting finiteness properties. 
We describe the polytope for an infinite family of binary 
trees and conjecture (based on extensive computations) that 
there is a universal bound on the number of vertices of the 
polytope of a binary tree. 

1. INTRODUCTION 

A phylogenetic tree is a rooted tree $T$ on $n$ nodes with a
$k$-ary random variable $X_i$ associated to every node. Write
$p(v)$ for the parent of node $v$. Then the transition
probabilities between $p(v)$ and $v$ are given by a $k \times k$
matrix $A^{(v)}$ for every non-root node $v$ of $T$.

In an application, $k$ might encode the four nucleic acids
that make up DNA, the two families of nucleic acids, or the
twenty amino acids. The transition matrices are generally
picked from some specific family such as the Jukes-Cantor,
Kimura, or general Markov models.

In this paper we consider the homogeneous Markov model
where all $A^{(v)}$ are equal, all nodes are binary ($k = 2$) and
observable, and the root has uniform distribution. We write

$$A^{(v)} = A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}.$$

The probability of observing $j$ at a node $v$ is computed from
the parent of $v$ by

$$P(X_v = j) = a_{0j} P(X_{p(v)} = 0) + a_{1j} P(X_{p(v)} = 1).$$
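The recursion for the node marginals can be sketched in a few lines of Python. This is only an illustration: the function name `node_marginals`, the parent-array encoding of the tree, and the numerical matrix are assumptions made for the example, not part of the paper.

```python
from fractions import Fraction

def node_marginals(parent, A, root_dist=(Fraction(1, 2), Fraction(1, 2))):
    """Return [P(X_v = 0), P(X_v = 1)] for every node v.

    parent[v] is the parent of node v (None for the root); nodes are assumed
    to be numbered so that every parent precedes its children.
    A[i][j] is the probability of moving from parent state i to child state j.
    """
    marg = []
    for p in parent:
        if p is None:
            marg.append(list(root_dist))  # uniform root by default
        else:
            # P(X_v = j) = a_0j P(X_p(v) = 0) + a_1j P(X_p(v) = 1)
            marg.append([A[0][j] * marg[p][0] + A[1][j] * marg[p][1]
                         for j in (0, 1)])
    return marg

# Assumed example: a path with 3 nodes and an arbitrary transition matrix.
path3 = [None, 0, 1]
A = [[Fraction(3, 4), Fraction(1, 4)],
     [Fraction(1, 2), Fraction(1, 2)]]
marginals = node_marginals(path3, A)
print(marginals[2])  # marginal distribution at the last node of the path
```

Exact rational arithmetic is used so that the marginals at each node visibly sum to 1.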



We are interested in the algebraic relations satisfied by the
joint distribution

$$p_{i_1 i_2 \ldots i_n} := P(X_1 = i_1, \ldots, X_n = i_n).$$

Writing the joint distribution in terms of the model
parameters $a_{00}, a_{01}, a_{10}, a_{11}$, we have

$$p_{i_1 i_2 \ldots i_n} = \frac{1}{2} \prod_{j=2}^{n} a_{i_{p(j)} i_j}, \tag{1}$$

where the nodes of the tree are labeled 1 to $n$ starting with
the root, and the constant factor $1/2$ comes from the uniform
root distribution. That is, up to this constant, the probability
of observing a certain labeling of the tree is the product of
the $a_{ij}$ that correspond to the transitions on all edges of
the tree. The indeterminates $a_{ij}$ parameterize a toric
variety of dimension 4 in $2^n$-dimensional space.
We let $I_T$ be the corresponding toric ideal, called the ideal
of phylogenetic invariants. In the notation of [11], the toric
ideal $I_T$ is specified by the $4 \times 2^n$ configuration $A_T$, where
column $(i_1, \ldots, i_n)$ consists of the exponent vector of the $a_{ij}$
in (1). We order the rows $(a_{00}, a_{01}, a_{10}, a_{11})$. Let $P_T$ be the
convex hull of the columns of $A_T$.

We are interested in two questions from [8]. First, which
relations on the joint probabilities $p_{i_1 \ldots i_n}$ does the model
imply? This problem is solved by giving generators of the
ideal of invariants $I_T$.

In Section 2, we study the generators of this ideal. Our
main accomplishment is the computation of Gröbner and
Markov bases for trees with 11 nodes. These are computations
in 2048 indeterminates, which we believe to be the
largest number of indeterminates ever in a Gröbner basis
calculation. We also calculate generating sets for all trees
on at most 9 nodes. Based on this evidence, we conjecture
that if $T$ is binary, then the ideal $I_T$ has a quadratic
generating set, and furthermore, that relations of degree $n$ are
necessary to generate $I_T$ for certain trees with $n$ nodes.

Our second goal is to determine, given a labeling of the tree
$T$, whether we can identify parameters $a_{ij}$ such that the
labeling is the most likely among all labelings. This problem
is solved by computing the normal fan of the toric variety.

In Section 3, we study this normal fan and the polytope $P_T$.
Our main result, Theorem 1, is an explicit description of
the polytope $P_T$ for an infinite family of binary trees. For
this family, $P_T$ always has 8 vertices and 6 facets, which we
characterize. We also present extensive calculations of $P_T$
for various trees and conjecture that there is a bound on the
number of vertices of $P_T$ as $T$ ranges over all binary trees.

The invariants vanish for a given distribution $(p_{i_1 \ldots i_n})$
essentially when that distribution comes from our model. Thus
the knowledge of the generators of this ideal is potentially
very useful for fitting biological sequence data to a
phylogenetic tree, as first noted by Cavender and Felsenstein.
While there has been much progress towards finding the
ideal of invariants for other phylogenetic models,
the homogeneous model is particularly attractive
because the low number of parameters makes it possible to
compute non-trivial examples. Hopefully we can use the
homogeneous model to approximate in some sense the general
model, perhaps by subdividing edges of the tree.



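Example 1. Let $T$ be a path with 3 nodes. Then, with columns
indexed by the labelings $000, 001, \ldots, 111$ and rows ordered
$(a_{00}, a_{01}, a_{10}, a_{11})$,

$$A_T = \begin{pmatrix}
2 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 2
\end{pmatrix}$$

and so we see that
the polytope $P_T$ has 7 vertices and 6 facets, and the toric
ideal of the path of length 3 is generated by 6 binomials:

$$I_T = \langle\, x_{101} - x_{010},\;\; x_{001}x_{100} - x_{000}x_{010},\;\;
x_{011}x_{100} - x_{001}x_{110},\;\; x_{011}x_{110} - x_{010}x_{111},\;\;
x_{001}^2 x_{111} - x_{000}x_{011}^2,\;\; x_{100}^2 x_{111} - x_{000}x_{110}^2 \,\rangle.$$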
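The data in Example 1 can be checked mechanically. The sketch below builds the columns of $A_T$ for the path with 3 nodes and tests each listed binomial using the standard criterion that $x^u - x^v$ lies in a toric ideal exactly when both monomials have the same image under the configuration. The helper names and the parent-array encoding of the tree are assumptions made for this illustration; the check confirms membership in $I_T$, not that the six binomials generate it.

```python
from itertools import product

def exponent_vector(parent, labels):
    """Count transitions (0->0, 0->1, 1->0, 1->1) over the edges parent[v] -> v."""
    vec = [0, 0, 0, 0]
    for v, p in enumerate(parent):
        if p is not None:
            vec[2 * labels[p] + labels[v]] += 1
    return tuple(vec)

def configuration(parent):
    """Map each 0/1 labeling (as a string such as '010') to its column of A_T."""
    n = len(parent)
    return {"".join(map(str, lab)): exponent_vector(parent, lab)
            for lab in product((0, 1), repeat=n)}

def image(cols, monomial):
    """Sum of the columns of A_T over the variables of a monomial (with repeats)."""
    return tuple(sum(cols[x][k] for x in monomial) for k in range(4))

def in_toric_ideal(cols, lhs, rhs):
    """x^lhs - x^rhs lies in I_T iff both monomials have the same image."""
    return image(cols, lhs) == image(cols, rhs)

path3 = [None, 0, 1]          # node 0 is the root, node v has parent v - 1
cols = configuration(path3)

# The six binomials of Example 1; a squared variable is listed twice.
generators = [
    (["101"], ["010"]),
    (["001", "100"], ["000", "010"]),
    (["011", "100"], ["001", "110"]),
    (["011", "110"], ["010", "111"]),
    (["001", "001", "111"], ["000", "011", "011"]),
    (["100", "100", "111"], ["000", "110", "110"]),
]
membership = [in_toric_ideal(cols, lhs, rhs) for lhs, rhs in generators]
distinct_columns = set(cols.values())
print(membership, len(distinct_columns))
```

The number of distinct columns agrees with the 7 vertices stated in the example, since the labelings 010 and 101 give the same column.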



2. TORIC IDEALS 

The toric ideals $I_T$ are homogeneous, since all monomials in
(1) have the same degree $n - 1$. Thus they define projective
toric varieties $Y_T$. Algebraic geometers usually require a
toric variety to be normal, but the reader should be warned
that the toric varieties discussed in this paper are generally
not normal.

Recall that a projective toric variety given by a configuration
$A = (a_1, \ldots, a_k)$ is covered by the affine toric varieties given
by $A - a_i$. An affine toric variety defined by a configuration
$A$ is said to be smooth if the semigroup $\mathbb{N}A$ is isomorphic
to $\mathbb{N}^r$ for some $r$ (Lemma 2.2 of the cited reference).

Proposition 1. The projective toric variety $Y_T$ of a
binary tree $T$ is not smooth.

Proof. Recall that the columns of the configuration $A_T$
are indexed by 0/1-labelings of the tree $T$. Look at the affine
chart $A_T - a_{0 \ldots 0}$, where $a_{0 \ldots 0}$ corresponds to the all zero tree.
On this chart, write $\bar{a}_i = a_i - a_{0 \ldots 0}$. Let $10 \ldots 0$ be the tree
with a 1 at the root and zeros everywhere else, $0 \ldots 01$ be
the tree with a 1 at a single leaf and zeros everywhere else,
and $0 \ldots 010 \ldots 0$ be the tree with a single 1 at the parent of
a leaf and zeros elsewhere. Then since $a_{0 \ldots 0} = (n-1, 0, 0, 0)$,
we have

$$\begin{aligned}
\bar{a}_{10 \ldots 0} &= (n-3, 0, 2, 0) - a_{0 \ldots 0} = (-2, 0, 2, 0), \\
\bar{a}_{0 \ldots 01} &= (n-2, 1, 0, 0) - a_{0 \ldots 0} = (-1, 1, 0, 0), \\
\bar{a}_{0 \ldots 010 \ldots 0} &= (n-4, 1, 2, 0) - a_{0 \ldots 0} = (-3, 1, 2, 0).
\end{aligned}$$

Therefore, the semigroup $\mathbb{N}(A_T - a_{0 \ldots 0})$ is not isomorphic to
$\mathbb{N}^r$ and the toric variety $Y_T$ is not smooth. □



We are primarily interested in the generators of the ideals
$I_T$. Knowledge of the generators would allow us to easily
compute whether given data came from the homogeneous
Markov model on some specific phylogenetic tree.

Using 4ti2, Gröbner and Markov bases for the ideal $I_T$
were computed for all trees with at most 9 nodes as well
as selected trees with 10 and 11 nodes. This took about 6
weeks of computer time in total on a 2 GHz computer. The
computations in 2048 variables (trees with 11 nodes) each
took as long as a week and required over 2 GB of memory.

Details about the Markov bases for all binary trees with at 
most 11 nodes are shown in Table 1. These computations 
lead us to make the following conjectures. 



Conjecture 1. The toric ideal corresponding to a binary
tree is generated in degree 2. More generally, if every
non-leaf node of the tree has the same number of children $d$
(for $d \geq 2$), the toric ideal is generated in degree 2.



Conjecture 2. There exists a quadratic Gröbner basis
for the toric ideal of a binary tree.

Using the Gröbner Walk implementation in Magma, we
have computed thousands of Gröbner bases for random term
orders for the smallest binary trees. It doesn't seem to be
possible to compute the entire Gröbner fan for these
examples with CaTS, but the random computations have
yielded some information: Conjecture 2 is true for the
binary tree with 5 nodes; in fact, there are at least 4 distinct
quadratic Gröbner bases for this tree. Analysis of these
bases lends some optimism towards Conjecture 2. However,
for the binary trees on 7 nodes, computation of over 1000
Gröbner bases did not find a quadratic basis. The best
basis found contained quartics and some bases even contained
relations of degree 29.

Another nice family of toric ideals is given by $I_T$ for $T$ a
path of length $n$. Table 2 presents data for Markov bases of
paths that leads us to conjecture that this family also has
well-behaved ideals.



Conjecture 3. The toric ideal corresponding to a path
is generated in degree 3, with $2n - 4$ generators of degree 3
needed.


Unfortunately, the toric ideal of a general tree doesn't seem
to have such simple structure. For $n \leq 9$, the trees with
highest degree minimal generators are those of the form

[tree diagram]

These trees require generators of degree $n$.



Tree      Degree of I_T   #Minimal generators   Max degree of generator
[tree]    28              79
[tree]    92              441
[tree]    96              561
[tree]    210             2141
[tree]    220             2068
[tree]    210             2266
[tree]    412             7121
[tree]    404             7131
[tree]    400             7137
[tree]    412             7551
[tree]    412             7551
[tree]    404             7561

Table 1: Degree of $I_T$, number of minimal generators,
and maximum degree of the generators



3. POLYTOPES 

In this section, we are interested in the following problem.
Given any observation $(i_1, \ldots, i_n)$ of the tree, which matrices
$A = (a_{ij})$ make $p_{i_1 \ldots i_n}$ maximal among the coordinates of
the distribution $p$?

To solve this problem, transform to logarithmic coordinates
$b_{ij} = \log(a_{ij})$. Then the condition that $p_{i_1 \ldots i_n} > p_{l_1 \ldots l_n}$
for all $(l_1, \ldots, l_n) \in \{0, 1\}^n$ is translated into the linear
system of inequalities

$$b_{i_{p(2)} i_2} + \cdots + b_{i_{p(n)} i_n} > b_{l_{p(2)} l_2} + \cdots + b_{l_{p(n)} l_n}$$

for all $(l_1, \ldots, l_n) \in \{0, 1\}^n$. The set of solutions to these
inequalities is a polyhedral cone. For most values of $i_1, \ldots, i_n$,
this cone will be empty. Those sequences $i_1, \ldots, i_n$ for which
the cone is maximal are called Viterbi sequences. The
collection of the cones, as $(i_1, \ldots, i_n)$ varies, is the normal fan of
the polytope $P_T$, where $P_T$ is the convex hull of the columns
of $A_T$.
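For a small tree, the most likely labelings for a fixed matrix $A$ can be found by brute force, which gives a concrete feel for which sequences can be Viterbi sequences. The sketch below is an illustration only: `viterbi_labelings`, the parent-array encoding, and the two matrices are assumptions chosen for the example. It compares products of the $a_{ij}$ exactly instead of sums of the $b_{ij}$, which selects the same maximizers because the logarithm is monotone, and it ignores the constant root factor.

```python
from fractions import Fraction
from itertools import product

def viterbi_labelings(parent, A):
    """Return (best product of transition entries, labelings attaining it).

    parent[v] is the parent of node v (None for the root).
    A[i][j] is the transition probability from parent state i to child state j.
    """
    n = len(parent)
    best, argbest = None, []
    for labels in product((0, 1), repeat=n):
        prob = Fraction(1)
        for v, p in enumerate(parent):
            if p is not None:
                prob *= A[labels[p]][labels[v]]
        if best is None or prob > best:
            best, argbest = prob, [labels]
        elif prob == best:
            argbest.append(labels)  # ties: several labelings are equally likely
    return best, argbest

path3 = [None, 0, 1]  # assumed example: path with 3 nodes

# A matrix that favours staying in state 0.
A_stay = [[Fraction(9, 10), Fraction(1, 10)],
          [Fraction(1, 2), Fraction(1, 2)]]
best_stay, labels_stay = viterbi_labelings(path3, A_stay)

# A matrix that favours switching state on every edge.
A_flip = [[Fraction(1, 10), Fraction(9, 10)],
          [Fraction(9, 10), Fraction(1, 10)]]
best_flip, labels_flip = viterbi_labelings(path3, A_flip)
print(labels_stay, labels_flip)
```

In the second case two labelings tie, so that matrix lies on the common boundary of two cones instead of in the interior of one.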

Notice that $P_T$ is a polytope in $\mathbb{R}^4$. However, since all the
monomials in (1) are of degree $n - 1$, we see that this
polytope is actually contained in $n - 1$ times the unit simplex in
$\mathbb{R}^4$. Thus, $P_T$ is actually a 3 dimensional polytope. We call
$P_T$ the Viterbi polytope.

The polytopes $P_T$ show remarkable finiteness properties as
$T$ varies. Since $P_T$ is defined as the convex hull of $2^n$ vectors,
it would seem that it could have arbitrarily bad structure.
However, as it is contained in $n - 1$ times the unit simplex, it
can be shown that there are at most $O(n^3)$ integral points
in $P_T$.
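One way to see the cubic bound is to count all lattice points of the dilated simplex, which contains every integral point of $P_T$. The following is a standard stars-and-bars count, offered as a sketch of the reasoning and not as the argument used in the paper:

```latex
% Lattice points of (n-1) times the unit simplex in R^4 are the
% nonnegative integer solutions of b_{00} + b_{01} + b_{10} + b_{11} = n - 1.
\[
\#\bigl\{\, b \in \mathbb{Z}_{\geq 0}^{4} : b_{00} + b_{01} + b_{10} + b_{11} = n - 1 \,\bigr\}
  = \binom{(n-1) + 3}{3}
  = \binom{n+2}{3}
  = \frac{(n+2)(n+1)n}{6}
  = O(n^{3}).
\]
```

For $n = 11$ this gives at most 286 candidate points, compared with the $2^{11} = 2048$ labelings that define the polytope.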



Example 2. Eric Kuo has shown [7] that if $T$ is a path
with $n$ nodes, then $P_T$ has only two combinatorial types for
$n > 3$, depending only on the parity of $n$. The polytope
for the path with 7 nodes is shown in Figure 1. Think of
this picture as roughly a tetrahedron with the vertex
corresponding to all $0 \to 1$ transitions and the vertex with all
$1 \to 0$ transitions both sliced off (since if a path has a $0 \to 1$
transition it must have a $1 \to x$ transition).

Table 2: Degree of $I_T$, size of Markov basis, maximum
degree of a minimal generator, and number of
degree 3 generators for paths



Two facts from Example 2 are important to remember. First,
the structure of the polytope is related more to the
topology of the tree than the size of the tree. Second, there is
a distinction between even and odd length paths. We call
a binary tree completely odd if the tree has all leaves at an
odd distance from the root. For example, the tree
[tree diagram] is completely odd.

Theorem 1. Let $T$ be a completely odd binary tree with
more than three nodes. The associated polytope $P_T$ always
has the same combinatorial type with 8 vertices and 6 facets
(see Figure 2).

Proof. First, we derive six inequalities that are
satisfied by any binary tree, deriving a "universal" polytope for
binary trees. Then we show that a completely odd binary
tree has labelings that give us all vertices of the "universal"
polytope.

Figure 1: $P_T$ for $T$ a path with 7 nodes, after
projecting onto the first three coordinates $(b_{00}, b_{01}, b_{10})$.

Thinking of the polytope space as the log space of the
parameters $a_{ij}$, we write the coordinates as $b_{00}, b_{01}, b_{10}, b_{11}$.
Since $P_T$ lies in $n - 1$ times the unit simplex in $\mathbb{R}^4$, we have
$b_{00} + b_{01} + b_{10} + b_{11} = n - 1$ and the 4 inequalities $b_{ij} \geq 0$.
We claim that any binary tree $T$ satisfies two additional
inequalities

$$\frac{b_{00} - b_{01}}{2} + b_{10} \leq \frac{n+1}{2}, \tag{2}$$

$$\frac{b_{11} - b_{10}}{2} + b_{01} \leq \frac{n+1}{2}. \tag{3}$$

We prove (2); the second inequality follows by interchanging
1 and 0.

Fix a labeling of the binary tree. We claim that the left hand
side of (2) counts the number of zeros that are "created"
while moving down the tree, that is, it counts the number
of leaves that are zero, minus one if the root is labeled zero.
Pick a non-leaf of the tree which is labeled "0". It has two
children. If both are "0", then this node contributes 2 to
$b_{00} - b_{01}$. If both are "1", then this node contributes $-2$ to
$b_{00} - b_{01}$. If one is "0" and one is "1", then the node doesn't
contribute. We think of a "0" node with two "0" children
as having created a new zero and a "0" node with two "1"
children as having deleted a zero. Therefore we see that the
term $(b_{00} - b_{01})/2$ counts the number of zeros created as
children of "0" nodes. Similarly, if a non-leaf is labeled "1",
then its contribution to $b_{10}$ counts the number of new zeros
in the children.

Since there are $\frac{n+1}{2}$ leaves in a binary tree, there can be
at most $\frac{n+1}{2}$ zeros created, so (2) holds. Notice that the
labelings that lie on this facet are exactly those with a one
at the root and all zeros at the leaves.

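These six inequalities and the equality $b_{00} + b_{01} + b_{10} + b_{11} =
n - 1$ define a three dimensional polytope in $\mathbb{R}^4$. We compute
that there are eight vertices of this polytope:

$$\begin{aligned}
&(n-1, 0, 0, 0), && (n-3, 0, 2, 0), \\
&\left(\tfrac{n-3}{2}, \tfrac{n+1}{2}, 0, 0\right), && \left(0, \tfrac{n-3}{3}, \tfrac{2n}{3}, 0\right), \\
&\left(0, \tfrac{2n}{3}, \tfrac{n-3}{3}, 0\right), && \left(0, 0, \tfrac{n+1}{2}, \tfrac{n-3}{2}\right), \\
&(0, 2, 0, n-3), && (0, 0, 0, n-1).
\end{aligned}$$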

Six of these vertices occur in any binary tree: a tree with
all zeros gives the $(n-1, 0, 0, 0)$ vertex, a tree with a one
at the root and zeros elsewhere gives $(n-3, 0, 2, 0)$, and
a tree with ones at the leaves and zeros elsewhere gives
$(\frac{n-3}{2}, \frac{n+1}{2}, 0, 0)$. Interchanging 1 and 0 gives three more
vertices. However, the remaining two vertices aren't obtained
by all binary trees.

The vertex $(0, \frac{n-3}{3}, \frac{2n}{3}, 0)$ lies on the facet defined by (2),
so we know it must have a one at the root, all zeros at the
leaves, and the labels must alternate going down the tree
since there are no zero to zero or one to one transitions. This
means that this vertex is representable by a labeled tree if
and only if the tree has all leaves at an odd depth from the
root. Notice that this implies that $n$ must be divisible by 3
for the tree to be completely odd. Finally, if $n > 3$ is odd
and divisible by 3, then $n \geq 9$ and one checks that the eight
vertices are distinct.

See Figure 2 for a picture of the polytope and a Schlegel 
diagram with descriptions of the labelings on the facets and 
at the vertices. □ 
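The statement of Theorem 1 can be sanity-checked by enumeration on a small case. The sketch below uses an assumed example, a completely odd binary tree on 9 nodes encoded as a parent array, and verifies that every labeling satisfies the two inequalities from the proof and that all eight listed vertices are realized by labelings. It does not compute the convex hull, so it is a consistency check and not a proof.

```python
from itertools import product

def exponent_vector(parent, labels):
    """Count transitions (0->0, 0->1, 1->0, 1->1) over the edges parent[v] -> v."""
    vec = [0, 0, 0, 0]
    for v, p in enumerate(parent):
        if p is not None:
            vec[2 * labels[p] + labels[v]] += 1
    return tuple(vec)

# Assumed example: root 0 has children 1 (a leaf) and 2; node 2 has children
# 3 and 4; nodes 3 and 4 each have two leaf children. Leaves sit at depth 1 or 3.
parent = [None, 0, 0, 2, 2, 3, 3, 4, 4]
n = len(parent)
points = {exponent_vector(parent, lab) for lab in product((0, 1), repeat=n)}

def satisfies_both(b, n):
    """The two extra inequalities from the proof, multiplied by 2."""
    b00, b01, b10, b11 = b
    return (b00 - b01 + 2 * b10 <= n + 1) and (b11 - b10 + 2 * b01 <= n + 1)

# The eight vertices from the proof; all entries are integers for n = 9.
vertices = [
    (n - 1, 0, 0, 0), (n - 3, 0, 2, 0),
    ((n - 3) // 2, (n + 1) // 2, 0, 0), (0, (n - 3) // 3, 2 * n // 3, 0),
    (0, 2 * n // 3, (n - 3) // 3, 0), (0, 0, (n + 1) // 2, (n - 3) // 2),
    (0, 2, 0, n - 3), (0, 0, 0, n - 1),
]

all_inside = all(satisfies_both(b, n) for b in points)
all_realized = all(v in points for v in vertices)
print(all_inside, all_realized)
```

The alternating labeling with a one at the root produces the point $(0, 2, 6, 0)$ for this tree, which is the vertex that fails to be realized when some leaf sits at an even depth.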



In the case where $T$ is binary but not completely odd, the
polytope shares 6 vertices with this universal polytope, but
the remaining 2 vertices are either not integral or not
realizable. However, the polytope still shares much of the
boundary with the universal polytope, so it is perhaps
realistic to expect that the polytope for a general binary tree
behaves well. Table 3 shows data from computations for all
binary trees with at most 23 nodes. The maximum number
of vertices of $P_T$ appears to grow very slowly with the size
of the tree.

Although binary trees seem to generally have polytopes with
few vertices, arbitrary trees are not so nice. For example,
Figure 3 shows a tree with 15 nodes that has a polytope
with 34 vertices.

Figure 2: The polytope of the completely odd binary tree and a Schlegel diagram of this polytope with facets
and vertices labeled.

Figure 3: A tree $T$ with 15 nodes for which $P_T$ has
34 vertices, 58 edges, and 26 faces.

Table 4 shows data for all trees on at most 15 nodes. It
appears that the maximum number of vertices for the
polytope of an arbitrary tree of size $n$ grows approximately as
$2n$. Notice that the tree with all leaves at depth 1 has $P_T$
a tetrahedron, giving the unique minimum number, 4, of
vertices for all trees.

Conjecture 4. There is a bound on the number of
vertices of $P_T$, where $T$ ranges over all binary trees. However,
for an arbitrary tree, the number of vertices of $P_T$ is
unbounded.

Number of nodes   Number of trees   Min vertices   Max vertices   Ave vertices
3                 2                 4              7              5.5
4                 4                 4              8              7
5                 9                 4              11             8
6                 20                4              14             9.7
7                 48                4              15             10.75
8                 115               4              20             12.59
9                 286               4              21             13.67
10                719               4              22             15.42
11                1842              4              25             16.60
12                4766              4              28             18.3
13                12486             4              31             19.5
14                32973             4              32             19.75
15                87811             4              34             22.6

Table 4: Minimum, maximum and average number
of vertices of $P_T$ over all trees with at most 15 nodes

To extend these computations, a better algorithm for
computing $P_T$ needs to be developed. The naive algorithm for
computing $P_T$ involves a loop of size $2^n$, elimination of
duplicate points, and a convex hull computation. This algorithm
can certainly be improved, but it is not known whether there
is a polynomial time algorithm for constructing the polytope
given a tree. Is there a fast algorithm that, given a tree $T$
and a point of $\mathbb{R}^4$, outputs whether that point arises from a
labeling of $T$? If so, then $P_T$ could be constructed by testing
the $O(n^3)$ points inside $n - 1$ times the unit simplex.
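The first two steps of the naive algorithm, the loop over all $2^n$ labelings and the removal of duplicate points, can be sketched as follows. The function name and the parent-array encoding are assumptions for this illustration, and the final convex hull step is left out; it could be delegated to a general-purpose hull routine after projecting onto three coordinates, since the points lie in a hyperplane of $\mathbb{R}^4$.

```python
from itertools import product
from math import comb

def distinct_points(parent):
    """Enumerate all 2^n labelings and collect the distinct columns of A_T."""
    n = len(parent)
    points = set()
    for labels in product((0, 1), repeat=n):
        vec = [0, 0, 0, 0]  # counts of 0->0, 0->1, 1->0, 1->1 transitions
        for v, p in enumerate(parent):
            if p is not None:
                vec[2 * labels[p] + labels[v]] += 1
        points.add(tuple(vec))  # the set removes duplicate points
    return points

path3 = [None, 0, 1]      # path with 3 nodes
star4 = [None, 0, 0, 0]   # root with three leaf children (all leaves at depth 1)

for tree in (path3, star4):
    n = len(tree)
    pts = distinct_points(tree)
    # Every point lies in (n - 1) times the unit simplex, so the count is
    # bounded by the number of lattice points there, C(n + 2, 3).
    print(n, len(pts), comb(n + 2, 3))
```

The running time is dominated by the $2^n$ loop, while the number of surviving points is at most cubic in $n$, which is why a fast membership test for individual points would change the picture.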